ggplot2, an introduction

Jim Rose

Lecture for IBS519, Fall 2023

Part 3: What the heck is ggplot2?

ggplot2 is your first introduction to the tidyverse

Tidyverse

  • A collection of R packages serving different functions but following the same principles

  • Designed to work well together

  • Extremely nerdy

Marvel Universe

  • A collection of heroes/movies with different skills but following the same principles

  • Frequent crossovers/cameos

  • Extremely nerdy

It is possible to create almost ANY type of graphic using R and ggplot2

The Layered Grammar of Graphics of ggplot2

  • One reason that ggplot2 is so popular is that it uses a consistent framework, or “grammar”, which makes it both intuitive AND flexible

  • Elements of any chart (e.g. axes, points, bars, text, etc.) are “layered” on top of each other to form the complete graphic

The layered grammar of graphics concept was introduced by Hadley Wickham (@hadleywickham@fosstodon.org), the creator of ggplot2

“Good grammar is just the first step in creating a good sentence”

Anatomy of a ggplot

ggplot(data=data, aes(x=C, y=A)) #Layer 1

Tip

Data is usually defined in the first layer, the ggplot() call, and is inherited by all subsequent layers unless a layer supplies its own data argument
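
A minimal sketch of data inheritance, assuming the small example data frame printed later in these slides. The `highlight` subset is a hypothetical name introduced only to show a layer supplying its own data.

```r
library(ggplot2)

# Example data frame matching the one printed later in these slides
data <- data.frame(
  A = c(2, 1, 4, 9),
  B = c(3, 2, 5, 10),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Hypothetical subset, used only to illustrate a per-layer data override
highlight <- data[data$D == "b", ]

p <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point() +                                         # inherits `data`
  geom_point(data = highlight, color = "red", size = 4)  # uses its own data

# layer_data() returns the data each layer actually draws
inherited  <- layer_data(p, 1)
overridden <- layer_data(p, 2)
```

Here the first geom draws all four rows, while the second draws only the two rows in `highlight`.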

Anatomy of a ggplot

ggplot(data=data, aes(x=C, y=A)) + #Layer 1
  geom_point() #Layer 2

Tip

Layers are added together using the “+” operator

Important

Order matters! Layers are drawn in the order they are added, so later layers sit on top of earlier ones
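
A short sketch of why layer order matters, assuming the same small example data frame used throughout these slides. The colors are arbitrary choices for illustration.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80)
)

# Line first, points second: the points are drawn on top of the line
p_points_on_top <- ggplot(data = data, aes(x = C, y = A)) +
  geom_line(color = "grey60") +
  geom_point(color = "red", size = 4)

# Points first, line second: the line is drawn over the points
p_line_on_top <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(color = "red", size = 4) +
  geom_line(color = "grey60")
```

Printing each plot should show the same data with a different element on top.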

Anatomy of a ggplot

ggplot(data=data, aes(x=C, y=A)) + #Layer 1
  geom_point() + #Layer 2
  geom_line() #Layer 3

Tip

geom_*() functions are used in secondary layers to define the geometric objects used in the plot

You can add as many geoms to the plot as you like

Important

But keep in mind the data viz pillars!

Anatomy of a ggplot

ggplot(data=data, aes(x=C, y=A)) + #Layer 1
  geom_point() + #Layer 2
  geom_line() + #Layer 3
  coord_flip() + #Layer 4
  scale_x_continuous(limits=c(1,100)) #Layer 5

Tip

These functions are added to make changes to the default coordinate system or axis scales

They start with the prefixes coord_ and scale_x_ / scale_y_, respectively

Anatomy of a ggplot

ggplot(data=data, aes(x=C, y=A)) + #Layer 1
  geom_point() + #Layer 2
  geom_line() + #Layer 3
  coord_flip() + #Layer 4
  scale_x_continuous(limits=c(1,100)) + #Layer 5
  theme_classic() #Layer 6

Tip

Finally, theme functions (a pre-built theme_*() or theme() itself) can be added to alter the default appearance of the plot

e.g. title, axis titles, font size, background color, etc

Setting properties in geoms

You can change attributes of a layer by specifying them as arguments within the geom_*() function

[Figure: Shape codes in R]

ggplot(data=data, aes(x=C, y=A)) +
  geom_point(color="red", shape=12, size=8)

Mapping variables to aesthetics using aes()

The aes() function is used to assign variables in your data to graphical parameters (or channels)

  • Position (x,y)
  • Color (fill, color)
  • Shape (shape, linetype)
  • Size (size)
  • Transparency (alpha)
  • Groupings (group)

Important

Syntax note: Variable names do NOT use quotes when used inside aes()
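
A minimal sketch of the quoting rule, assuming the small example data frame used in these slides. Quoting a name inside aes() maps a constant string instead of the column. The `.data` pronoun is one way to refer to a column when its name is held in a string.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80)
)

# Correct: bare column names are looked up in `data`
p_unquoted <- ggplot(data = data, aes(x = C, y = A)) + geom_point()

# Pitfall: quoted names are treated as the constant strings "C" and "A",
# so every point lands on the same discrete position
p_quoted <- ggplot(data = data, aes(x = "C", y = "A")) + geom_point()

# If the column name is stored as a string, the .data pronoun can be used
x_name <- "C"
p_pronoun <- ggplot(data = data, aes(x = .data[[x_name]], y = A)) + geom_point()
```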

Setting vs Mapping

Setting static attribute

ggplot(data=data, aes(x=C, y=A)) +
  geom_point(color="red", size=8)

Mapping to data

ggplot(data=data, aes(x=C, y=A)) +
  geom_point(aes(color=D, size=C))

[1] "data"
  A  B  C D
1 2  3  4 a
2 1  2  1 a
3 4  5 15 b
4 9 10 80 b
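
The data frame printed above can be recreated as follows, which makes the setting-versus-mapping difference easy to check. This is a sketch: the object names `p_set`, `p_map`, and `p_oops` are illustrative, not from the lecture.

```r
library(ggplot2)

# Recreate the example data frame shown above
data <- data.frame(
  A = c(2, 1, 4, 9),
  B = c(3, 2, 5, 10),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Setting: a fixed value outside aes(), every point is red
p_set <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(color = "red", size = 8)

# Mapping: values inside aes() are driven by columns and get a legend
p_map <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(aes(color = D, size = C))

# Common slip: a constant inside aes() is treated as data, so the points
# get the default palette color and a legend entry labelled "red"
p_oops <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(aes(color = "red"))

set_colors  <- layer_data(p_set)$colour
map_colors  <- layer_data(p_map)$colour
oops_colors <- layer_data(p_oops)$colour
```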

Globally vs locally mapping variables

Local Mapping

ggplot(data=data, aes(x=C, y=A)) +
  geom_point(aes(color=D)) +
  geom_line()

Global Mapping

ggplot(data=data, aes(x=C, y=A, color=D)) +
  geom_point() +
  geom_line()

Note

Aesthetics mapped in the original ggplot() call are applied to all later layers/geoms; aesthetics mapped inside a geom apply only to that geom
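
A sketch of one visible consequence, assuming the example data frame from these slides. With a global color mapping, geom_line() also inherits color=D and so draws one line per level of D. With a local mapping, the line is not split.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Local mapping: only the points know about D, so there is a single line
p_local <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(aes(color = D)) +
  geom_line()

# Global mapping: the line inherits color = D and is split into two lines
p_global <- ggplot(data = data, aes(x = C, y = A, color = D)) +
  geom_point() +
  geom_line()

# Count the groups used by the line layer (layer 2) in each plot
local_line_groups  <- length(unique(layer_data(p_local, 2)$group))
global_line_groups <- length(unique(layer_data(p_global, 2)$group))
```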

Saving ggplots

ggplots can be stored as objects in R

plot <- ggplot(data=data, aes(x=C, y=A)) +
  geom_point(aes(color=D)) +
  geom_line()

Use ggsave() to save jpg, png, pdf versions of plots

plot
ggsave(filename="myfirstplot.pdf", plot=plot, height=4, width=4)
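
A hedged sketch of saving to a raster format, assuming the example data frame from these slides. The temporary output path, `units`, and `dpi` values are illustrative choices. ggsave() picks the file type from the filename extension.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

p <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point(aes(color = D)) +
  geom_line()

# Write to a temporary folder so the sketch does not clutter the project
out_file <- file.path(tempdir(), "myfirstplot.png")

# width/height are interpreted in `units`; dpi sets the raster resolution
ggsave(filename = out_file, plot = p, width = 4, height = 4,
       units = "in", dpi = 300)
```

Naming the plot argument explicitly avoids relying on the last plot displayed.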

Basic geoms

Points

  • geom_point()
  • geom_jitter()

Distributions

  • geom_histogram()
  • geom_density()
  • geom_boxplot()
  • geom_violin()

Statistics

  • geom_smooth()

Barplots

  • geom_bar()
  • geom_col()

Lines

  • geom_abline()
  • geom_hline()
  • geom_vline()

Text

  • geom_text()
  • geom_label()
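
The two barplot geoms listed above are easy to confuse, so here is a small sketch of the difference, assuming the example data frame from these slides. geom_bar() counts rows for each x value, while geom_col() uses a y value you supply.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  D = c("a", "a", "b", "b")
)

# geom_bar(): bar height is the number of rows in each level of D
p_bar <- ggplot(data = data, aes(x = D)) +
  geom_bar()

# geom_col(): bar height comes from the y aesthetic; rows that share
# an x value are stacked by default
p_col <- ggplot(data = data, aes(x = D, y = A)) +
  geom_col()

bar_counts  <- layer_data(p_bar)$count
col_top     <- max(layer_data(p_col)$ymax)
```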

Scale and Coordinate Modifiers

Titles

Modify text labels on the plot, axes, etc.

  • labs()

  • xlab()

  • ylab()

Scales

Change aspects of axis/channel scales

  • scale_x_continuous()

    • _discrete() (position scales have no _manual() variant)

  • scale_color_continuous()

    • _discrete()

    • _manual()

  • scale_fill_continuous()

    • _discrete()

    • _manual()
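
A minimal sketch of a manual color scale, assuming the example data frame from these slides. The specific color names are arbitrary choices for illustration.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80),
  D = c("a", "a", "b", "b")
)

# Map D to color, then choose the color for each level by name
p <- ggplot(data = data, aes(x = C, y = A, color = D)) +
  geom_point(size = 4) +
  scale_color_manual(values = c(a = "steelblue", b = "firebrick"))

# Colors actually assigned to each point
point_colors <- layer_data(p)$colour
```

Naming the values by level keeps the assignment stable if the level order changes.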

Coordinate Systems

  • coord_cartesian()
  • coord_flip()
  • coord_polar()

Faceting

  • facet_grid()
  • facet_wrap()

Faceting on the dose variable
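
A sketch of faceting on a dose variable. The slides do not say which data set the figure used, so this assumes R's built-in ToothGrowth data (columns len, supp, dose) purely for illustration.

```r
library(ggplot2)

# ToothGrowth ships with R; its use here is an assumption, not
# necessarily the data behind the original figure
p <- ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot() +
  facet_wrap(~ dose)   # one panel per dose level

# Number of panels produced by the facet
n_panels <- length(unique(layer_data(p)$PANEL))
```

facet_grid(rows ~ cols) works similarly but lays panels out in a strict grid of two variables.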

Themes

Pre-built themes

Add these to ggplots as quick theme changes

  • theme_grey() (the default theme in ggplot2)

  • theme_bw()

  • theme_light()

  • theme_dark()

  • theme_minimal()

  • theme_classic()

  • theme_void()

Themes

Theme function

Extremely customizable

  • theme()

e.g.

ggplot() + theme(axis.text=element_text(size=10))
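
A sketch of combining a pre-built theme with theme() tweaks, assuming the example data frame from these slides. Because layers are added in order, a complete theme added after theme() replaces the earlier tweak.

```r
library(ggplot2)

data <- data.frame(
  A = c(2, 1, 4, 9),
  C = c(4, 1, 15, 80)
)

base <- ggplot(data = data, aes(x = C, y = A)) +
  geom_point() +
  labs(title = "Example plot", x = "C", y = "A")

# Pre-built theme first, then the tweak: axis text size is 10
styled <- base +
  theme_classic() +
  theme(axis.text = element_text(size = 10))

# Tweak first, then the pre-built theme: the tweak is replaced
overwritten <- base +
  theme(axis.text = element_text(size = 10)) +
  theme_classic()

styled_size      <- calc_element("axis.text", styled$theme)$size
overwritten_size <- calc_element("axis.text", overwritten$theme)$size
```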

Remember the pillars!

Google (& ChatGPT!) are your friends

Resources: